{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 说明\n",
    "\n",
    "请按照填空顺序编号分别完成 参数优化，不同基函数的实现"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "def load_data(filename):\n",
    "    \"\"\"载入数据。\"\"\"\n",
    "    xys = []\n",
    "    with open(filename, 'r') as f:\n",
    "        for line in f:\n",
    "            xys.append(map(float, line.strip().split()))\n",
    "        xs, ys = zip(*xys)\n",
    "        return np.asarray(xs), np.asarray(ys)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 不同的基函数 (basis function)的实现 填空顺序 2\n",
    "\n",
    "请分别在这里实现“多项式基函数”以及“高斯基函数”\n",
    "\n",
    "其中以及训练集的x的范围在0-25之间"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "def identity_basis(x):\n",
    "    ret = np.expand_dims(x, axis=1)\n",
    "    return ret\n",
    "\n",
    "def multinomial_basis(x, feature_num=10):\n",
    "    '''多项式基函数'''\n",
    "    x = np.expand_dims(x, axis=1) # shape(N, 1)\n",
    "    #==========\n",
    "    #todo '''请实现多项式基函数'''\n",
    "    #==========\n",
    "    ret = None\n",
    "    return ret\n",
    "\n",
    "def gaussian_basis(x, feature_num=10):\n",
    "    '''高斯基函数'''\n",
    "    #==========\n",
    "    #todo '''请实现高斯基函数'''\n",
    "    #==========\n",
    "    ret = None\n",
    "    return ret"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 返回一个训练好的模型 填空顺序 1 用最小二乘法进行模型优化 \n",
    "## 填空顺序 3 用梯度下降进行模型优化\n",
    "> 先完成最小二乘法的优化 (参考书中第二章 2.3中的公式)\n",
    "\n",
    "> 再完成梯度下降的优化   (参考书中第二章 2.3中的公式)\n",
    "\n",
    "在main中利用训练集训练好模型的参数，并且返回一个训练好的模型。\n",
    "\n",
    "计算出一个优化后的w，请分别使用最小二乘法以及梯度下降两种办法优化w"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "def main(x_train, y_train):\n",
    "    \"\"\"\n",
    "    训练模型，并返回从x到y的映射。\n",
    "    \n",
    "    \"\"\"\n",
    "    basis_func = identity_basis\n",
    "    phi0 = np.expand_dims(np.ones_like(x_train), axis=1)\n",
    "    phi1 = basis_func(x_train)\n",
    "    phi = np.concatenate([phi0, phi1], axis=1)\n",
    "    \n",
    "    \n",
    "    #==========\n",
    "    #todo '''计算出一个优化后的w，请分别使用最小二乘法以及梯度下降两种办法优化w'''\n",
    "    #==========\n",
    "    \n",
    "    def f(x):\n",
    "        phi0 = np.expand_dims(np.ones_like(x), axis=1)\n",
    "        phi1 = basis_func(x)\n",
    "        phi = np.concatenate([phi0, phi1], axis=1)\n",
    "        y = np.dot(phi, w)\n",
    "        return y\n",
    "\n",
    "    return f"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 评估结果 \n",
    "> 没有需要填写的代码，但是建议读懂"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def evaluate(ys, ys_pred):\n",
    "    \"\"\"评估模型。\"\"\"\n",
    "    std = np.sqrt(np.mean(np.abs(ys - ys_pred) ** 2))\n",
    "    return std\n",
    "\n",
    "# 程序主入口（建议不要改动以下函数的接口）\n",
    "if __name__ == '__main__':\n",
    "    train_file = 'train.txt'\n",
    "    test_file = 'test.txt'\n",
    "    # 载入数据\n",
    "    x_train, y_train = load_data(train_file)\n",
    "    x_test, y_test = load_data(test_file)\n",
    "    print(x_train.shape)\n",
    "    print(x_test.shape)\n",
    "\n",
    "    # 使用线性回归训练模型，返回一个函数f()使得y = f(x)\n",
    "    f = main(x_train, y_train)\n",
    "\n",
    "    y_train_pred = f(x_train)\n",
    "    std = evaluate(y_train, y_train_pred)\n",
    "    print('训练集预测值与真实值的标准差：{:.1f}'.format(std))\n",
    "    \n",
    "    # 计算预测的输出值\n",
    "    y_test_pred = f(x_test)\n",
    "    # 使用测试集评估模型\n",
    "    std = evaluate(y_test, y_test_pred)\n",
    "    print('预测值与真实值的标准差：{:.1f}'.format(std))\n",
    "\n",
    "    #显示结果\n",
    "    plt.plot(x_train, y_train, 'ro', markersize=3)\n",
    "#     plt.plot(x_test, y_test, 'k')\n",
    "    plt.plot(x_test, y_test_pred, 'k')\n",
    "    plt.xlabel('x')\n",
    "    plt.ylabel('y')\n",
    "    plt.title('Linear Regression')\n",
    "    plt.legend(['train', 'test', 'pred'])\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  },
  "pycharm": {
   "stem_cell": {
    "cell_type": "raw",
    "source": [],
    "metadata": {
     "collapsed": false
    }
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}